<HTML> 
<HEAD> 
<TITLE> A Joy interpreter written in Joy</TITLE> 
</HEAD> 
<BODY> 
<I> by Manfred von Thun</I> 
<P>
This short paper contains the design of a Joy interpreter 
written in Joy itself. 
<P> 
Let  L1  and  L2  be two languages. 
An interpreter for language  L1 , written in language  L2  
is a program  P  which takes as a parameter any program written 
in  L1  and executes it in the same way as a processor for  L1  would have. 
If the two languages are the same language  L , 
such an interpreter is called <EM>metacircular</EM>. 
Languages in which <EM>program = data</EM> 
make it particularly easy to write an interpreter 
in its own language. 
The best known example is the <EM>Lisp</EM> interpreter <CODE>eval</CODE>. 
A metacircular interpreter for Joy can also be written. 
<P> 
The remainder of this paper assumes some familiarity with Joy. 
<H1> A Joy interpreter written in Joy </H1> 
<P> 
A Joy interpreter in Joy is a program 
which expects a quoted program on top of the stack and executes it. 
The following is the design of Joy interpreter <KBD>joy</KBD> 
written in Joy itself. 
The first version of the interpreter is merely a reminder that 
Joy already has a combinator, 
the <CODE>i</CODE> combinator, 
which removes a program from the top of the stack and executes it. 
<PRE> 
joy  ==  i 
</PRE> 
<P> 
The next version makes explicit the fact that Joy programs 
are sequences which are executed by <KBD>step</KBD>ping through the 
members of the sequence. 
For each member in the sequence an appropriate action 
is taken. 
The <CODE>step</CODE> combinator will indiscriminately 
leave literals, operators and combinators on top of the stack. 
But any operators and combinators <EM>on top of the stack</EM> 
cannot actually be executed. 
However, their <KBD>unitlist</KBD> can be executed by 
the <KBD>i</KBD> combinator. 
So this is the next version of the interpreter: 
<PRE> 
joy  == 
        [ unitlist 
          i ] 
        step 
</PRE> 
<P> 
The last interpreter does not actually <EM>specify</EM> 
what is to be done what a particular element of the sequence 
has appeared on top of the stack. 
A better one should say things like this for the <EM>operator</EM>s: 
<PRE>
If the top element is "+" 
    then pop off the "+" and add the two numbers below 
If the top element is "rest" 
    then pop off the "rest" and take the rest of the aggregate below 
</PRE>
For the <EM>literal</EM>s it is even simpler: 
<PRE>
If the top element is any number 
    then leave it there and do not do anything 
If the top number is any list 
    then leave it there and do not do anything 
</PRE>
Essentially we need a way of examining what is on top of the 
stack and executing the appropriate action. 
So in the previous version of the interpreter 
we must replace the <CODE>unitlist</CODE> 
by something more specific. 
It has to consist of several cases, 
for the various literals, operators and combinators. 
The <KBD>opcase</KBD> operator is suitable for just that. 
It expects any kind of item on top of the stack, 
and above that a list of cases. 
Each case is a list consisting 
of a test item and a (possibly) empty rest. 
The effect of the <CODE>opcase</CODE> operator 
is to remove the top item and the list of cases, 
and to leave behind the rest of case for which 
the item matched the test item. 
The last case in the list is the default, 
it does not have a test item. 
The default is returned if there was no match. 
For the present design step the default does nothing. 
The following is what has to replace the <CODE>unitlist</CODE> 
in the interpreter: 
<PRE> 
      [ [ 0 ]                           (* sample number *) 
        [ [] ]                          (* sample list *) 
        [ + pop + ]                     (* addition operator *) 
        [ rest pop rest ]               (* rest operator *) 
        [ ] ]                           (* default, do nothing *) 
      opcase 
</PRE> 
It is an easy matter to add the other cases for <EM>literal</EM>s. 
They have to be treated just like numbers and list: 
<PRE> 
        [ 'A ]                          (* sample character *) 
        [ true ]                        (* sample truth value *) 
        [ "" ]                          (* sample string *) 
        [ {} ]                          (* sample set *) 
</PRE> 
Similarly, other operators have to be added such as 
<PRE> 
        [ swap pop swap ]               (* swap operator *) 
        [ cons pop cons ]               (* cons operator *) 
</PRE> 
<P> 
For the combinators it is tempting to treat them just like 
operators: 
<PRE> 
        [ dip pop dip ]                 (* dip combinator - WRONG *) 
        [ map pop map ]                 (* map combinator - WRONG *) 
</PRE> 
This will work correctly, but 
it just uses the Joy system 
<EM>inside</EM> the quoted program that is being called by 
<CODE>i</CODE> or <CODE>map</CODE>. 
Instead it should use the Joy-in-Joy interpreter that we are writing. 
To achieve that effect, 
the program parameter <CODE>[P]</CODE> for the combinator 
has to be replaced by <CODE>[[P] joy]</CODE>. 
For <CODE>i</CODE>, <CODE>dip</CODE> and <CODE>map</CODE> 
and other <EM>unary combinator</EM>s this is quite easy: 
after the <CODE>pop</CODE> execute <CODE>[joy] cons</CODE>. 
This gives cases like the following: 
<PRE> 
        [ i    pop [joy] cons  i   ]    (* i combinator *) 
        [ dip  pop [joy] cons  dip ]    (* dip combinator *) 
        [ map  pop [joy] cons  map ]    (* map combinator *) 
</PRE> 
The case for the <CODE>i</CODE> combinator 
is unnecessarily inefficient, 
it could be optimised to 
<PRE> 
        [ i    pop             joy ]    (* i combinator *) 
</PRE> 
However, for uniformity this optimisation will not be used here. 
<P> 
So that we do not lose track, 
here is the interpreter as designed so far: 
<PRE> 
joy  ==                                 (* literals *) 
    [ [ [ 0    ] 
        [ []   ] 
        [ true ] 
        [ 'A   ] 
        [ ""   ] 
        [ {}   ] 
                                        (* operators *) 
        [ +             pop     +    ] 
        [ rest          pop     rest ] 
        [ dup           pop     dup  ] 
        [ swap          pop     swap ] 
        [ pop           pop     pop  ] 
        [ -             pop     -    ] 
        [ and           pop     and  ] 
        [ cons          pop     cons ] 
                                        (* unary combinators *) 
        [ i             pop [joy] cons    i      ] 
        [ dip           pop [joy] cons    dip    ] 
        [ map           pop [joy] cons    map    ] 
        [ filter        pop [joy] cons    filter ] 
        [ ] ]                           (* provisional default *) 
      opcase 
      i ] 
    step 
</PRE> 
<P> 
The interpreter is getting close to its final shape now, 
but several things need to be fixed. 
Obviously the <EM>binary combinator</EM>s 
have to be treated in a way that is similar to the unary ones: 
The two program parameters <CODE>[P]</CODE> and <CODE>[Q]</CODE> 
have to be replaced by <CODE>[[P] joy]</CODE> and <CODE>[[Q] joy]</CODE>. 
This is best done by using the <KBD>app2</KBD> combinator as follows: 
<PRE> 
        [[joy] cons]  app2 
</PRE> 
So for the binary combinators the cases look like this: 
<PRE> 
                                        (* binary combinators *) 
        [ branch        pop [[joy] cons] app2   branch ] 
        [ cleave        pop [[joy] cons] app2   cleave ] 
</PRE> 
For the <EM>ternary combinator</EM>s 
and <EM>quaternary combinator</EM>s the pattern is much the same, 
there are now three or four program parameters that need to be 
modified. 
This is easily done using the <KBD>app3</KBD> and <KBD>app4</KBD> 
combinators to effect the modification. 
<PRE> 
                                        (* ternary combinators *) 
        [ ifte          pop [[joy] cons] app3   ifte ] 
                                        (* quaternary combinators *) 
        [ linrec        pop [[joy] cons] app4   linrec ] 
        [ binrec        pop [[joy] cons] app4   binrec ] 
</PRE> 
<P> 
There are still two major amendments needed for the interpreter. 
The first concerns user defined symbols as they might occur 
in the standard library, 
the personal library 
or in the preamble to a particular run. 
All Joy symbols have an internal tag, and 
the tags differ individually only for the 
operators and combinators. 
However, <EM>all</EM> numbers have the same internal tag, 
<EM>all</EM> characters have the same tag, 
<EM>all</EM> lists have the same tag and so on. 
Similarly  <EM>all</EM> defined symbols have the same tag. 
The <CODE>opcase</CODE> operator looks at these tags, 
so all that is needed is one new case for user defined symbols. 
Just as any number will do as the prototype for numbers, 
so any user defined symbol will do as the prototype for defined symbols. 
We might as well choose "joy". 
When a defined symbol has been encountered, 
it is necessary to find its definition, 
which is the program that constitutes the body of that definition. 
The <KBD>body</KBD> operator will find that, 
it expects a user defined symbol on top of the stack 
and returns the defining program in quoted form. 
The Joy interpreter now has to execute this. 
But it is essential that the Joy interpreter 
should execute inside that quotation. 
So it will not do to use the <CODE>i</CODE> combinator, 
but the <CODE>joy</CODE> interpreter itself has to be used. 
Hence the case for user defined symbols is just 
<PRE> 
        [ joy           body joy ] 
</PRE> 
The single most common case will be the call of 
a defined atom rather than an inbuilt one. 
To improve efficiency this case (with <CODE>joy</CODE>) is placed to the front 
of the caselist. 
<P> 
It always possible 
that the interpreter is used for programs 
which contain operators or combinators 
that are part of the language (not used defined) 
but have not been given cases in the interpreter itself, 
either intentionally or through neglect. 
The joy interpreter should be able 
to perform reasonably for those operators and combinators. 
So this is where the default clause for the <CODE>opcase</CODE> 
operator comes in handy. 
When none of the listed case apply, 
treat the symbol as we did in the second version of the interpreter: 
take the <CODE>unitlist</CODE> and use the <CODE>i</CODE> combinator 
to execute that. 
Instead of using <CODE>unitlist</CODE> it is better to use 
its definition <CODE>[] cons</CODE>. 
So the default clause looks like this: 
<PRE> 
        [               [] cons i ]     (* default *) 
</PRE> 
<P> 
This completes the working design of the interpreter. 
The following is its structure with the last two additions 
written out fully: 
<PRE> 
joy  == 
    [ [ [ joy           body joy ]      (* user defined *) 
        ...                             (* literals *) 
        ...                             (* operators *) 
        ...                             (* unary combinators *) 
        ...                             (* binary combinators *) 
        ...                             (* ternary combinators *) 
        ...                             (* quaternary combinators *) 
        [               [] cons i ] ]   (* default *) 
      opcase 
      i ] 
    step 
</PRE> 
<P> 
Before we write out a more complete version of the interpreter, 
it is useful to make a number of changes to the design. 
<P> 
Firstly, the cases for the combinators of various arities 
contain common code which is repeated again and again. 
The interpreter becomes more readable if such instances are 
factored out and defined separately. 
There are cases for unary, binary, ternary and quaternary combinators, 
and for each of these the common code will be called 
<CODE>cr1</CODE>, <CODE>cr2</CODE>, <CODE>cr3</CODE> and <CODE>cr4</CODE>. 
For uniformity the code for <CODE>cr1</CODE> is assimilated 
to that of the others, using <KBD>app1</KBD>. 
Since they are not likely to be wanted anywhere else, 
their definitions are given inside a <KBD>HIDE</KBD> declaration. 
<P> 
Secondly, 
it could be useful if the default case does its job not 
silently but traces out each symbol that it hands over 
to the Joy system. 
Such symbols are <CODE>dup</CODE>licated and then written out by <KBD>put</KBD>. 
This now makes the default case 
<PRE> 
        [               dup put [] cons i ] 
</PRE> 
<P> 
The interpreter <KBD>joy</KBD> now looks like this: 
<PRE> 
HIDE 
  cr1  ==  pop [[joy] cons] app1; 
  cr2  ==  pop [[joy] cons] app2; 
  cr3  ==  pop [[joy] cons] app3; 
  cr4  ==  pop [[joy] cons] app4 
IN 
  joy  == 
    [ [ [ joy           body    joy       ] 
        [ []                              ] 
        [ 0                               ] 
        ... 
        [ dup           pop     dup       ] 
        [ +             pop     +         ] 
        [ cons          pop     cons      ] 
        [ put           pop     put       ] 
        ... 
        [ i             cr1     i         ] 
        [ dip           cr1     dip       ] 
        [ map           cr1     map       ] 
        [ filter        cr1     filter    ] 
        [ app1          cr1     app1      ] 
        [ app2          cr1     app2      ] 
        ... 
        [ ifte          cr3     ifte      ] 
        [ linrec        cr4     linrec    ] 
        [ binrec        cr4     binrec    ] 
        ... 
        [               dup put [] cons i ] ] 
      opcase 
      i ] 
    step 
END 
</PRE> 
<P> 
The interpreter has appropriate cases for all literals and 
for quite a few operators and combinators. 
Most operators and combinators are still missing 
and will therefore be handled by the default clause. 
However, it is straightforward to make the interpreter 
more comprehensive and even complete. 
<P> 
It is of some interest to write an interpreter that is 
just adequate to interpret itself and leaves 
everything else to the default clause. 
The following is the minimal Joy interpreter <KBD>joy0</KBD>; 
for variety it uses the optimisation for the <CODE>i</CODE> combinator 
mentioned earlier. 
<PRE> 
joy0  == 
    [ [ [ joy0          body            joy0     ] 
        [ []                                     ] 
        [ pop           pop             pop      ] 
        [ cons          pop             cons     ] 
        [ opcase        pop             opcase   ] 
        [ body          pop             body     ] 
        [ i             pop             joy0     ] 
        [ step          pop [joy0] cons step     ] 
        [               [] cons         i        ] ] 
      opcase 
      i ] 
    step 
</PRE> 
<P> 
The two versions <KBD>joy</KBD> and <KBD>joy0</KBD> 
are in the file
<A HREF="joy/jp-joyjoy.joy"> Joy in Joy </A>.
<P>
Back to
<A HREF = "http://www.latrobe.edu.auj00syn.html">
   Synopsis of the programming language Joy </A>
</BODY> 
</HTML> 
